NSF PAR Search | NSF Public Access Repository

Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml

https://doi.org/10.1088/2632-2153/acc0d7

Khoda, Elham E; Rankin, Dylan; Teixeira de Lima, Rafael; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; et al (April 2023, Machine Learning: Science and Technology)

Abstract Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers—long short-term memory and gated recurrent unit—within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider.
Full Text Available
Real-time semantic segmentation on FPGAs for autonomous vehicles with hls4ml

https://doi.org/10.1088/2632-2153/ac9cb5

Ghielmetti, Nicolò; Loncar, Vladimir; Pierini, Maurizio; Roed, Marcel; Summers, Sioni; Aarrestad, Thea; Petersson, Christoffer; Linander, Hampus; Ngadiuba, Jennifer; Lin, Kelvin; et al (November 2022, Machine Learning: Science and Technology)

Abstract In this paper, we investigate how field programmable gate arrays can serve as hardware accelerators for real-time semantic segmentation tasks relevant for autonomous driving. Considering compressed versions of the ENet convolutional neural network architecture, we demonstrate a fully-on-chip deployment with a latency of 4.9 ms per image, using less than 30% of the available resources on a Xilinx ZCU102 evaluation board. The latency is reduced to 3 ms per image when increasing the batch size to ten, corresponding to the use case where the autonomous vehicle receives inputs from multiple cameras simultaneously. We show, through aggressive filter reduction and heterogeneous quantization-aware training, and an optimized implementation of convolutional layers, that the power consumption and resource utilization can be significantly reduced while maintaining accuracy on the Cityscapes dataset.
Full Text Available
QONNX: Representing Arbitrary-Precision Quantized Neural Networks

Pappalardo, Alessandro; Umuroglu, Yaman; Blott, Michaela; Mitrevski, Jovan; Hawks, Ben; Tran, Nhan; Loncar, Vladimir; Summers, Sioni; Borras, Hendrik; Muhizi, Jules; et al (January 2022, Fermi National Accelerator Lab)

Full Text Available
A Reconfigurable Neural Network ASIC for Detector Front-End Data Compression at the HL-LHC

https://doi.org/10.1109/TNS.2021.3087100

Di Guglielmo, Giuseppe; Fahim, Farah; Herwig, Christian; Valentin, Manuel Blanco; Duarte, Javier; Gingu, Cristian; Harris, Philip; Hirschauer, James; Kwok, Martin; Loncar, Vladimir; et al (August 2021, IEEE Transactions on Nuclear Science)
Full Text Available
Fast convolutional neural networks on FPGAs with hls4ml

https://doi.org/10.1088/2632-2153/ac0ea1

Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; et al (July 2021, Machine Learning: Science and Technology)
Full Text Available
Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics

https://doi.org/10.3389/fdata.2020.598927

Iiyama, Yutaro; Cerminara, Gianluca; Gupta, Abhijay; Kieseler, Jan; Loncar, Vladimir; Pierini, Maurizio; Qasim, Shah Rukh; Rieger, Marcel; Summers, Sioni; Van Onsem, Gerrit; et al (January 2021, Frontiers in Big Data)
Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than one μs on an FPGA. To do so, we consider a representative task associated with particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the hls4ml library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage.
Full Text Available
A new calibration method for charm jet identification validated with proton-proton collision events at √s = 13 TeV

https://doi.org/10.1088/1748-0221/17/03/P03014

Tumasyan, Armen; Adam, Wolfgang; Andrejkovic, Janik Walter; Bergauer, Thomas; Chatterjee, Suman; Dragicevic, Marko; Escalante Del Valle, Alberto; Fruehwirth, Rudolf; Jeitler, Manfred; Krammer, Natascha; et al (March 2022, Journal of Instrumentation)

Abstract Many measurements at the LHC require efficient identification of heavy-flavour jets, i.e. jets originating from bottom (b) or charm (c) quarks. An overview of the algorithms used to identify c jets is described and a novel method to calibrate them is presented. This new method adjusts the entire distributions of the outputs obtained when the algorithms are applied to jets of different flavours. It is based on an iterative approach exploiting three distinct control regions that are enriched with either b jets, c jets, or light-flavour and gluon jets. Results are presented in the form of correction factors evaluated using proton-proton collision data with an integrated luminosity of 41.5 fb⁻¹ at √s = 13 TeV, collected by the CMS experiment in 2017. The closure of the method is tested by applying the measured correction factors on simulated data sets and checking the agreement between the adjusted simulation and collision data. Furthermore, a validation is performed by testing the method on pseudodata, which emulate various mismodelling conditions. The calibrated results enable the use of the full distributions of heavy-flavour identification algorithm outputs, e.g. as inputs to machine-learning models. Thus, they are expected to increase the sensitivity of future physics analyses.
Full Text Available
